Improving ZooKeeper Atomic Broadcast Performance When a Server Quorum Never Crashes

Authors

Ibrahim EL-Sanosi

Paul Ezhilchelvan

Abstract

Operating at the core of the highly available ZooKeeper system is the ZooKeeper atomic broadcast (Zab) protocol, which imposes a total order on service requests that seek to modify the replicated system state. Zab is designed with the weakest assumptions possible under the crash-recovery fault model; e.g., any number of servers, even all of them, can crash simultaneously, and the system will continue or resume its service provisioning when a server quorum remains or becomes operative again. Our aim is to explore ways of improving Zab performance without modifying its easy-to-implement structure. To this end, we first assume that server crashes are independent and that a server quorum remains operative at all times. Under these restrictive, yet practical, assumptions, we propose three variations of Zab and compare their performance. The first variation offers excellent performance but can be used only for 3-server systems; the other two do not have this limitation. One of them reduces the leader overhead further by conditioning the sending of acknowledgements on the outcomes of coin tosses. Owing to its superb performance, it is re-designed to operate under the least-restricted Zab fault assumptions. Further performance comparisons confirm the potential of coin-tossing in offering better performance than Zab, particularly at high workloads.

Similar resources

Tail Latency in ZooKeeper and a Simple Reimplementation

ZooKeeper [1] is a commonly used service for coordinating distributed applications. ZooKeeper uses leader-based atomic broadcast for writes, so that all state modifications are globally totally ordered, but it allows stale reads from any server for high read availability. This design trades high read throughput for potentially high write latency. Unfortunately, the extent of this tradeoff and t...

On Barriers and the Gap between Active and Passive Replication

Active replication is commonly built on top of the atomic broadcast primitive. Passive replication, which has been recently used in the popular ZooKeeper coordination system, can be naturally built on top of the primary-order atomic broadcast primitive. Passive replication differs from active replication in that it requires processes to cross a barrier before they become primaries and start broa...

ZooKeeper’s atomic broadcast protocol: Theory and practice

Apache ZooKeeper is a distributed coordination service for cloud computing, providing essential synchronization and group services for other distributed applications. At its core lies an atomic broadcast protocol, which elects a leader, synchronizes the nodes, and performs broadcasts of updates from the leader. We study the design of this protocol, highlight promised properties, and analyze its...

Comparison of Failure Detectors and Group Membership: Performance Study of Two Atomic Broadcast Algorithms

Protocols that solve agreement problems are essential building blocks for fault tolerant distributed systems. While many protocols have been published, little has been done to analyze their performance, especially the performance of their fault tolerance mechanisms. In this paper, we present a performance evaluation methodology that can be generalized to analyze many kinds of fault-tolerant alg...

Consensus in a Box: Inexpensive Coordination in Hardware

Consensus mechanisms for ensuring consistency are some of the most expensive operations in managing large amounts of data. Often, there is a trade-off that involves reducing the coordination overhead at the price of accepting possible data loss or inconsistencies. As the demand for more efficient data centers increases, it is important to provide better ways of ensuring consistency without affe...

Publication date: 2017
